-
-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Load internal json v3 #16638
Load internal json v3 #16638
Conversation
Early benchmarks on my super old iMac show that this is noticeably faster. I'll try it on a newer machine at some point to get more relevant information. Note that $ hyperfine --parameter-list branch master,load-internal-json-v3 --warmup 3 --setup 'git switch {branch}' 'HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl'
Benchmark 1: HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = master)
Time (mean ± σ): 3.164 s ± 0.020 s [User: 2.086 s, System: 0.915 s]
Range (min … max): 3.133 s … 3.196 s 10 runs
Benchmark 2: HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = load-internal-json-v3)
Time (mean ± σ): 2.731 s ± 0.020 s [User: 1.694 s, System: 0.877 s]
Range (min … max): 2.710 s … 2.783 s 10 runs
Summary
HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = load-internal-json-v3) ran
1.16 ± 0.01 times faster than HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = master) |
Wow, that's huge! Great work @apainintheneck! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far!
df2bc06
to
2a4c716
Compare
Okay, I finally got around to adding some tests. There are tests for both generating the JSON format and loading formulae from the JSON file. I ended up creating a mini-tap fixture to facilitate testing as well. I assume grabbing a few random formulae for that is not a big deal. Let me know if there's anything else that would be good to test for. Edit: I guess I'll be debugging the cross-platform test failures for a bit... |
a155b13
to
c0ed1dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @apainintheneck 👏🏻
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | ||
"uses_from_macos_bounds" => [{}, {}], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to increase up scope here potentially (and can be in another PR) but: this API is fairly awful due to backwards compatibility concerns we don't have for v3 so would be nice to address. CC @Bo98 for thoughts on what that should look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we were going to cycle back and clean this up. I can open a PR with suggestions (probably easier after merging this PR though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool: given that: fine to merge with this as-is in this PR 👍🏻. A TODO or something might be nice though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a TODO be added here please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a note in the original issue thread: #16410 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems weird to leave a comment when any potential changes are exploratory. We haven't decided on the form that we want yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apainintheneck It's a hard TODO
because we're not going to ship the internal JSON v3 without changing this. I'm fine on not blocking the PR on this stuff but I would really like something beyond a comment in another issue to track this in the actual code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | |
"uses_from_macos_bounds" => [{}, {}], | |
# TODO: improve this API before we ship internal API v3 to users | |
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | |
"uses_from_macos_bounds" => [{}, {}], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apainintheneck It's a hard
TODO
because we're not going to ship the internal JSON v3 without changing this. I'm fine on not blocking the PR on this stuff but I would really like something beyond a comment in another issue to track this in the actual code.
I'm fine with the comment but this is news to me. I didn't know that this was that important either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for my unclear communication! I'm trying to get the balance right between being overly pedantic/demotivating and blocking individual PRs vs. making clear what's required before we ship to users. Great work getting this shipped, thanks @apainintheneck ❤️
"tap_git_head" => tap_git_head, | ||
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | ||
"uses_from_macos_bounds" => [{}, {}], | ||
"versions" => { "bottle"=>true, "head"=>nil, "stable"=>"0.58.1" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"versions" => { "bottle"=>true, "head"=>nil, "stable"=>"0.58.1" }, | |
"versions" => { "bottle"=>true, "stable"=>"0.58.1" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can probably compact or avoid adding that field in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, this only exists in the public facing JSON and should already be omitted from the internal one. I'm just using the public JSON here for testing because it exercises more parts of the formula code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh, I think it'd be preferable to use the internal JSON and to get full coverage of the formula code through the input JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh, I think it'd be preferable to use the internal JSON and to get full coverage of the formula code through the input JSON.
Sorry, I don't know what you mean by this. How can we get full coverage of the formula code through the input JSON? What input JSON are we referring to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying it seems very weird to be to use the non-internal public JSON for testing the internal JSON format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not that weird. It essentially mimics brew readall
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apainintheneck Ok but aren't these JSON formats going to diverge significantly?
I think I might just need an ELI5 treatment here, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically a fancy to_hash
test. We could do it as a series of checks instead, though would be multiple expects
in a single check so might be long:
expect(formula.name).to eq("something")
expect(formula.version.to eq("1.0")
expect(formula.revision).to be_zero
# ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#to_hash
is a bit better than doing a bunch of individual checks since it will end up calling more methods internally to fill out the fields we're not checking.
The goal here is not to validate the #to_hash
method but instead to be reasonably confident that the formula loads without failing. #to_hash
is just an easy way to take a peek inside the formula instance and run some arbitrary methods along the way. I'm open to alternatives that would be clearer, briefer or provide more assurance of correctness. The current test is not comprehensive at all.
#to_api_hash
is already being exercised by a different test where we generate the JSON API for the core formula tap.
Library/Homebrew/test/support/fixtures/internal_tap_json/homebrew-core/Formula/f/fennel.rb
Show resolved
Hide resolved
064cdf9
to
078c46a
Compare
9134e7e
to
6834f2a
Compare
This should be ready for review again. The test failures were caused by caching problems between test runs. I just needed to clear the caches a bit more between them and add one more variable to the I'm leaving this as draft until after the Monday release since I'd like to be cautious and a have a few days before it's available to everyone to avoid any unexpected bugs. |
Library/Homebrew/test/support/fixtures/internal_tap_json/homebrew-core.json
Outdated
Show resolved
Hide resolved
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | ||
"uses_from_macos_bounds" => [{}, {}], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a TODO be added here please.
"tap_git_head" => tap_git_head, | ||
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | ||
"uses_from_macos_bounds" => [{}, {}], | ||
"versions" => { "bottle"=>true, "head"=>nil, "stable"=>"0.58.1" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying it seems very weird to be to use the non-internal public JSON for testing the internal JSON format?
Here are the same benchmarks on my M1 MacBook that I use at work. Still significant performance boost though not as big as the one on my older computer. Percentage-wise it's more significant though. $ hyperfine --parameter-list branch master,load-internal-json-v3 --warmup 3 --setup 'git switch {branch}' 'HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl'
Benchmark 1: HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = master)
Time (mean ± σ): 1.079 s ± 0.010 s [User: 0.738 s, System: 0.164 s]
Range (min … max): 1.070 s … 1.096 s 10 runs
Benchmark 2: HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = load-internal-json-v3)
Time (mean ± σ): 876.8 ms ± 10.4 ms [User: 554.8 ms, System: 146.4 ms]
Range (min … max): 867.7 ms … 895.6 ms 10 runs
Summary
HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = load-internal-json-v3) ran
1.23 ± 0.02 times faster than HOMEBREW_INTERNAL_JSON_V3=1 brew info jrnl (branch = master) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks @apainintheneck. A couple of small requests but I'm fine if this is merged as-is and they are addressed in follow-up PRs.
"uses_from_macos" => [{ "llvm"=>[:build, :test] }, "zlib"], | ||
"uses_from_macos_bounds" => [{}, {}], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apainintheneck I'd really love an inline Ruby comment here if possible.
], | ||
"version": "0.58.1", | ||
"bottle": { | ||
"files": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could maybe strip this key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a result of trying to reuse the logic we currently have for building the bottle hash. The bottle hash will occasionally have another key if the rebuild number is non-zero as well.
hash["rebuild"] = bottle_spec.rebuild if !compact_for_api || !bottle_spec.rebuild.zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks, that's fine then 👍🏻
This abstracts away this helper to make it easier to test and reason about.
These tests cover both generating and loading formulae from the JSON bundle. The tests are not comprehensive but they do provide a nice sanity check that things are working as expected.
- clear the formula API cache - make the API cache directory - fix stubbed return values (thanks Sorbet!)
4a8b201
to
f1f2e24
Compare
I'll merge this in tomorrow. |
brew style
with your changes locally?brew typecheck
with your changes locally?brew tests
with your changes locally?This is an early PR to start loading the internal JSON v3 representation added in #16541. The hope is that this will not only load faster than the previous one but also will reduce the amount of data that has to flow across the network and reduce the amount of code needed to turn the current JSON representation into the expected shape internally.
My first thought is that the amount of code needed to get this to work is way less than I originally thought. Of course, no tests so far so this will grow a bit.